AITopics | robust markov decision process

Policy-Conditioned Uncertainty Sets for Robust Markov Decision Processes

Neural Information Processing Systems | Nov-20-2025, 22:28:53 GMT

What policy should be employed in a Markov decision process with uncertain parameters? The robust optimization answer to this question is to use rectangular uncertainty sets, which independently reflect available knowledge about each state, and then obtain a decision policy that maximizes expected reward for the worst-case decision process parameters from these uncertainty sets. While this rectangularity is computationally convenient and leads to tractable solutions, it often produces policies that are too conservative in practice, and it does not facilitate knowledge transfer between portions of the state space or across related decision processes. In this work, we propose non-rectangular uncertainty sets that bound marginal moments of state-action features defined over entire trajectories through a decision process. This enables generalization to different portions of the state space while retaining appropriate uncertainty about the decision process. We develop algorithms for solving the resulting robust decision problems, which reduce to finding an optimal policy for a mixture of decision processes, and demonstrate the benefits of our approach experimentally.
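For orientation, the rectangular baseline that this abstract contrasts against can be sketched as robust value iteration. This is a minimal illustration, not the paper's non-rectangular method: it assumes a small tabular MDP, an (s, a)-rectangular L1 ball of radius `eps` around a nominal transition matrix, and hypothetical names (`worst_case_l1`, `robust_value_iteration`, `P`, `R`) chosen here for illustration.

```python
import numpy as np

def worst_case_l1(p, v, eps):
    """Distribution within L1 distance eps of p that minimizes q @ v.

    Greedy rule: move up to eps/2 of probability mass onto the
    lowest-value state, taking it from the highest-value states first.
    """
    q = np.array(p, dtype=float)
    lo = int(np.argmin(v))
    shift = min(eps / 2.0, 1.0 - q[lo])
    q[lo] += shift
    for i in np.argsort(v)[::-1]:      # highest value first
        if shift <= 0:
            break
        if i == lo:
            continue
        take = min(q[i], shift)
        q[i] -= take
        shift -= take
    return q

def robust_value_iteration(P, R, eps, gamma=0.9, iters=200):
    """P has shape (S, A, S), R has shape (S, A).

    Each (s, a) pair gets its own independent uncertainty set,
    which is the rectangularity assumption.
    """
    S, A, _ = P.shape
    v = np.zeros(S)
    q = np.zeros((S, A))
    for _ in range(iters):
        for s in range(S):
            for a in range(A):
                worst = worst_case_l1(P[s, a], v, eps)
                q[s, a] = R[s, a] + gamma * (worst @ v)
        v = q.max(axis=1)
    return v, q.argmax(axis=1)
```

With `eps = 0` this reduces to ordinary value iteration on the nominal model; larger `eps` can only lower the computed values, which is the conservatism the abstract refers to.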

decision process, policy-conditioned uncertainty set, robust markov decision process, (3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.44)

Reinforcement Learning in Robust Markov Decision Processes

Shiau Hong Lim, Huan Xu, Shie Mannor

Neural Information Processing Systems | Oct-3-2025, 06:42:25 GMT

Neural Information Processing Systems http://nips.cc/

reinforcement learning, robust markov decision process

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.40)

Reinforcement Learning in Robust Markov Decision Processes

Neural Information Processing Systems | Sep-30-2025, 11:03:55 GMT

An important challenge in Markov decision processes is to ensure robustness with respect to unexpected or adversarial system behavior while taking advantage of well-behaving parts of the system. We consider a problem setting where some unknown parts of the state space can have arbitrary transitions while other parts are purely stochastic. We devise an algorithm that adapts to potentially adversarial behavior and show that it achieves regret bounds similar to those of the purely stochastic case.

name change, reinforcement learning, robust markov decision process, (1 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Efficient and Sharp Off-Policy Evaluation in Robust Markov Decision Processes

Neural Information Processing Systems | May-27-2025, 16:56:28 GMT

We study the evaluation of a policy under best- and worst-case perturbations to a Markov decision process (MDP), using transition observations from the original MDP, whether they are generated under the same or a different policy. This is an important problem when there is the possibility of a shift between historical and future environments, e.g. due to unmeasured confounding, distributional shift, or an adversarial environment. We propose a perturbation model that allows changes in the transition kernel densities up to a given multiplicative factor or its reciprocal, extending the classic marginal sensitivity model (MSM) for single time-step decision-making to infinite-horizon RL. We characterize the sharp bounds on policy value under this model -- i.e., the tightest possible bounds based on transition observations from the original MDP -- and we study the estimation of these bounds from such transition observations. We develop an estimator with several important guarantees: it is semiparametrically efficient, and remains so even when certain necessary nuisance functions, such as worst-case Q-functions, are estimated at slow, nonparametric rates.
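The perturbation model described above, where transition densities may change by at most a multiplicative factor or its reciprocal, can be illustrated for a single known discrete next-state distribution. The sketch below is only a tabular illustration under that assumption, not the paper's estimator; the function name `msm_worst_case_mean` and the symbol `lam` (the multiplicative factor, at least 1) are hypothetical.

```python
import numpy as np

def msm_worst_case_mean(p, v, lam):
    """Minimize q @ v over distributions q with p/lam <= q <= lam*p.

    Start every state at its lower bound p/lam, then hand out the
    remaining probability mass to the lowest-value states first,
    never exceeding the upper bound lam*p (fractional-knapsack greedy).
    """
    p = np.asarray(p, dtype=float)
    v = np.asarray(v, dtype=float)
    q = p / lam
    budget = 1.0 - 1.0 / lam           # mass still to be assigned
    for i in np.argsort(v):            # lowest value first
        if budget <= 0:
            break
        add = min(budget, p[i] * (lam - 1.0 / lam))
        q[i] += add
        budget -= add
    return float(q @ v), q

# Two equally likely next states with values 0 and 1, factor 2:
worst_mean, worst_q = msm_worst_case_mean([0.5, 0.5], [0.0, 1.0], 2.0)
```

In this example the adversary shifts mass toward the low-value state until the total is normalized, so `worst_q` is `[0.75, 0.25]` and the worst-case mean is 0.25 rather than the nominal 0.5. Setting `lam = 1` recovers the nominal distribution.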

efficient and sharp off-policy evaluation, robust markov decision process, transition observation, (6 more...)

Technology:

Information Technology > Decision Support Systems (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.64)

Reviews: Policy-Conditioned Uncertainty Sets for Robust Markov Decision Processes

Neural Information Processing Systems | Oct-7-2024, 15:21:05 GMT

The authors consider distributionally robust finite MDPs over a finite horizon. The transition probabilities conditional on a state-action pair should remain within a bounded L1 distance of a base measure, which is feasible because it is generated using a given reference policy. This is a nice idea. A few comments follow. Related to that question, why would the requirement of staying "close" to this policy be beneficial?

policy-conditioned uncertainty set, reference policy, robust markov decision process, (4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.72)

Robust $Q$-learning Algorithm for Markov Decision Processes under Wasserstein Uncertainty

Neufeld, Ariel, Sester, Julian

arXiv.org Artificial Intelligence | Jan-5-2023

We present a novel $Q$-learning algorithm to solve distributionally robust Markov decision problems, where the corresponding ambiguity set of transition probabilities for the underlying Markov decision process is a Wasserstein ball around a (possibly estimated) reference measure. We prove convergence of the presented algorithm and provide several examples, including ones using real data, to illustrate both the tractability of our algorithm and the benefits of considering distributional robustness when solving stochastic optimal control problems, in particular when the estimated distributions turn out to be misspecified in practice.
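As a reading aid, the kind of fixed-point equation such a distributionally robust $Q$-function would satisfy can be written generically as follows. This is a sketch in assumed notation, not taken from the paper: $\hat{P}(x,a)$ denotes the reference measure, $W$ a Wasserstein distance, $\varepsilon$ the ball radius, $r$ the reward, and $\alpha \in (0,1)$ the discount factor.

```latex
% Ambiguity set: Wasserstein ball around the reference measure
\mathcal{B}_\varepsilon\bigl(\hat{P}(x,a)\bigr)
  = \bigl\{ P : W\bigl(P, \hat{P}(x,a)\bigr) \le \varepsilon \bigr\}

% Robust Bellman equation: worst case over the ball at every step
Q^*(x,a)
  = \inf_{P \in \mathcal{B}_\varepsilon(\hat{P}(x,a))}
    \mathbb{E}_{X' \sim P}\Bigl[ r(x,a,X') + \alpha \max_{b} Q^*(X',b) \Bigr]
```

Setting $\varepsilon = 0$ collapses the ball to the reference measure and recovers the standard Bellman optimality equation.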

artificial intelligence, machine learning, reinforcement learning, (15 more...)

2210.00898

Country:

Asia > Singapore (0.14)
North America > United States > California (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.40)

Industry: Banking & Finance (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.63)

Policy-Conditioned Uncertainty Sets for Robust Markov Decision Processes

Tirinzoni, Andrea, Petrik, Marek, Chen, Xiangli, Ziebart, Brian

Neural Information Processing Systems | Feb-14-2020, 20:26:58 GMT

What policy should be employed in a Markov decision process with uncertain parameters? The robust optimization answer to this question is to use rectangular uncertainty sets, which independently reflect available knowledge about each state, and then obtain a decision policy that maximizes expected reward for the worst-case decision process parameters from these uncertainty sets. While this rectangularity is computationally convenient and leads to tractable solutions, it often produces policies that are too conservative in practice, and it does not facilitate knowledge transfer between portions of the state space or across related decision processes. In this work, we propose non-rectangular uncertainty sets that bound marginal moments of state-action features defined over entire trajectories through a decision process. This enables generalization to different portions of the state space while retaining appropriate uncertainty about the decision process.

decision process, policy-conditioned uncertainty set, robust markov decision process, (1 more...)

Technology:

Information Technology > Decision Support Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.65)

Reinforcement Learning in Robust Markov Decision Processes

Lim, Shiau Hong, Xu, Huan, Mannor, Shie

Neural Information Processing Systems | Feb-14-2020, 15:13:19 GMT

An important challenge in Markov decision processes is to ensure robustness with respect to unexpected or adversarial system behavior while taking advantage of well-behaving parts of the system. We consider a problem setting where some unknown parts of the state space can have arbitrary transitions while other parts are purely stochastic. We devise an algorithm that adapts to potentially adversarial behavior and show that it achieves regret bounds similar to those of the purely stochastic case. Papers published at the Neural Information Processing Systems Conference.

reinforcement learning, robust markov decision process

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.70)
